MethylExtract: High-Quality methylation maps and SNV calling from whole genome bisulfite sequencing data

نویسندگان

  • Guillermo Barturen
  • Antonio Rueda
  • José L. Oliver
  • Michael Hackenberg
  • Michael Stadler
  • Felix Krueger
  • Jörn Walter
چکیده

Whole genome methylation profiling at a single cytosine resolution is now feasible due to the advent of high-throughput sequencing techniques together with bisulfite treatment of the DNA. To obtain the methylation value of each individual cytosine, the bisulfite-treated sequence reads are first aligned to a reference genome, and then the profiling of the methylation levels is done from the alignments. A huge effort has been made to quickly and correctly align the reads and many different algorithms and programs to do this have been created. However, the second step is just as crucial and non-trivial, but much less attention has been paid to the final inference of the methylation states. Important error sources do exist, such as sequencing errors, bisulfite failure, clonal reads, and single nucleotide variants. We developed MethylExtract, a user friendly tool to: i) generate high quality, whole genome methylation maps and ii) detect sequence variation within the same sample preparation. The program is implemented into a single script and takes into account all major error sources. MethylExtract detects variation (SNVs - Single Nucleotide Variants) in a similar way to VarScan, a very sensitive method extensively used in SNV and genotype calling based on non-bisulfite-treated reads. The usefulness of MethylExtract is shown by means of extensive benchmarking based on artificial bisulfite-treated reads and a comparison to a recently published method, called Bis-SNP. MethylExtract is able to detect SNVs within High-Throughput Sequencing experiments of bisulfite treated DNA at the same time as it generates high quality methylation maps. This simultaneous detection of DNA methylation and sequence variation is crucial for many downstream analyses, for example when deciphering the impact of SNVs on differential methylation. An exclusive feature of MethylExtract, in comparison with existing software, is the possibility to assess the bisulfite failure in a statistical way. The source code, tutorial and artificial bisulfite datasets are available at http://bioinfo2.ugr.es/MethylExtract/ and http://sourceforge.net/projects/methylextract/, and also permanently accessible from 10.5281/zenodo.7144.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

MethylExtract: High-Quality methylation maps and SNV calling from whole genome bisulfite sequencing data [version 2; referees: 3 approved]

Whole genome methylation profiling at a single cytosine resolution is now feasible due to the advent of high-throughput sequencing techniques together with bisulfite treatment of the DNA. To obtain the methylation value of each individual cytosine, the bisulfite-treated sequence reads are first aligned to a reference genome, and then the profiling of the methylation levels is done from the alig...

متن کامل

Best practices for SNV and methylation calling from bisulfite sequencing data

The advent of high-throughput sequencing techniques, together withbisulfite treatment of the DNA, allows for genome methylation profiling at asingle cytosine resolution. Some modern tools, as MethylExtract [1], are alsoable to detect variation (SNVs) using the same bisulfite sequencing library,which is crucial for many downstream analyses, for example when decipheringthe imp...

متن کامل

VM : a virtual machine for the integral analysis of bisulfite sequencing data

The analysis of whole genome DNA methylation patterns is an important first step towards the understanding on how DNA methylation is involved in the regulation of gene expression and genome stability. Previously, we published MethylExtract, a program for DNA methylation profiling and genotyping from the same sample. Over the last years we developed it further into a methylation analysis pipelin...

متن کامل

BS-virus-finder: virus integration calling using bisulfite sequencing data

Background DNA methylation plays a key role in the regulation of gene expression and carcinogenesis. Bisulfite sequencing studies mainly focus on calling single nucleotide polymorphism, different methylation region, and find allele-specific DNA methylation. Until now, only a few software tools have focused on virus integration using bisulfite sequencing data. Findings We have developed a new ...

متن کامل

A Bayesian Framework to Identify Methylcytosines from High-Throughput Bisulfite Sequencing Data

High-throughput bisulfite sequencing technologies have provided a comprehensive and well-fitted way to investigate DNA methylation at single-base resolution. However, there are substantial bioinformatic challenges to distinguish precisely methylcytosines from unconverted cytosines based on bisulfite sequencing data. The challenges arise, at least in part, from cell heterozygosis caused by multi...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره 2  شماره 

صفحات  -

تاریخ انتشار 2013